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Abstract 

Non-coding RNA (ncRNA) genes produce functional RNA
molecules instead of encoding proteins; the ncRNAs themselves
carry the information needed to perform their function.
Genetic information is classically regarded as being expressed
through proteins, yet most of the genome of mammals and other
complex organisms is transcribed into ncRNAs. The current study
was designed to predict the ncRNAs in the genomes of the
Enterobacter cloacae complex by employing in silico
approaches. Various putative ncRNAs were predicted in four
different species of the Enterobacter cloacae complex. Extensive
in silico analyses were performed and promoters were
predicted for all the selected ncRNAs. The predicted
promoter regions were validated for further analyses. The
selected ncRNAs were used for secondary structure
prediction. All the predicted secondary structures were
validated with various evaluation tools and were found to be
suitable. All the selected ncRNAs were observed to be
stable and were characterized on the basis of hairpin
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loops, least MFE value and promoter regions. In conclusion, the predicted ncRNAs 
have the ability to perform stable functions. 
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maturation of messenger RNAs (mRNAs), 
ribosomal RNAs (rRNAs), transfer RNAs 
(tRNAs), and X-chromosome inactivation 
in mammals. Regulatory ncRNA genes, including
microRNAs (miRNAs) and small interfering
RNAs (siRNAs) in eukaryotic cells, have been
discovered in all kingdoms of life [6]. In
bacteria, small RNAs serve several regulatory
roles: some act as antitoxin components of
toxin-antitoxin systems, whereas others adjust
bacterial physiology in response to
environmental cues. Such RNAs have been
discovered in many species throughout
the bacterial kingdom [7]. Besides microRNAs,
other ncRNAs are emerging as main elements
of cellular homeostasis: PIWI-interacting
RNAs (piRNAs), small nucleolar RNAs
(snoRNAs), transcribed ultra-conserved
regions (T-UCRs) and large intergenic
non-coding RNAs (lincRNAs). Dysregulation
of these ncRNAs, in addition to microRNAs,
has been found to contribute to tumorigenesis
and to neurological, cardiovascular and
developmental diseases [8]. Bacteria
encode an enormous number of small
non-coding RNAs (sRNAs) that modulate
gene expression at the post-transcriptional
level. Many sRNAs control the expression
of outer membrane proteins (OMPs).
Enterobacteria (Escherichia coli and
Salmonella) are now known to encode at
least eight OMP-regulating sRNAs (InvR,
MicA, MicC, MicF, OmrAB, RseX and
RybB). These sRNAs exert their functions
under diverse growth and stress
conditions by regulating their
associated genes [9].

In silico methods were used to predict
novel ncRNAs on the basis of the general
features and common characteristics of
ncRNAs. Over the last decade,
progressive improvement has been 
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Introduction 


Non-coding RNA (ncRNA) genes produce
functional RNA molecules rather than
translated proteins. ncRNAs do not
encode proteins but regulate their
associated genes. They are involved in
several cellular processes, including
regulation of gene expression, RNA
modification and editing [1]. In humans,
approximately 98% of the genome can be
transcribed while only 2% encodes
protein [2], which suggests that a
large part of the genome may encode
ncRNAs. The total number of ncRNAs is
still unknown. ncRNAs may or may not be
functional; non-functional ncRNAs are
referred to as junk RNA [3].
Functional RNA molecules are
components of cellular machines such as
ribosomes (ribosomal RNAs), the
spliceosome and telomerase.

The Enterobacter cloacae complex comprises
Gram-negative, facultatively anaerobic,
rod-shaped, non-spore-forming bacteria
that belong to the family
Enterobacteriaceae. Many strains are
pathogenic. Enterobacter species are
0.6-1 µm in diameter and 1.2-3 µm long [4].
Of these species, 80% are encapsulated,
and the optimal growth temperature is
30 °C. The bacteria produce acid upon
glucose fermentation. Enterobacter
species can cause several infections,
including cerebral abscess, pneumonia,
meningitis and intestinal infections.
Members of the family Enterobacteriaceae
colonize the lower gastrointestinal
tract of humans and animals, and their
hosts can be plants, animals or humans.

ncRNAs are particularly abundant in
roles that require highly specific
nucleic acid recognition without
complex catalysis [5]. The processes that
involve ncRNAs include gene regulation,
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Validation of ncRNAs 


The predicted and screened ncRNAs were
then validated. The Rfam database [18]
was used to screen and validate the
names of all the selected ncRNAs,
confirming which sequences correspond to
already known genes and which are
unknown. The ncRNA sequence search was
performed through the Rfam (Xfam) server.
The names of the ncRNA sequences were
assigned on the basis of bit score,
E-value and strand. The ncRNAs with
unknown names were dropped from the
experiment.


tRNA search 


The ncRNA set was checked for tRNAs so
that irrelevant tRNAs could be excluded.
tRNAscan-SE [19] was used to remove
them. The ncRNA sequences were
submitted in FASTA format for analysis.


Promoter Prediction 


Promoter regions were predicted for the
ncRNAs with the Promoter 2.0 prediction
server [20]. Promoter 2.0 predicts
transcription start sites of vertebrate
Pol II promoters in nucleotide sequences.
It was developed by evolving simulated
transcription factors that interact with
sequences in promoter regions. FASTA
sequences were used for the analyses.


Structural characterization of ncRNAs


Secondary structure prediction was
performed to visualize the base-paired
stems and loops of the selected
ncRNAs. The Mfold web server [21] was used
to predict the secondary structures. The
RNA Folding Form was used with the linear
sequence option, a maximum interior/bulge
loop size of 30, a maximum asymmetry of an
interior/bulge loop of 30 and a folding
temperature of 37 °C. Numerous structures
were generated for detailed analyses.
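
Mfold predicts minimum free energy (MFE) structures from thermodynamic parameters. As a simplified illustration of the dynamic programming behind such predictions, the sketch below uses Nussinov base-pair maximization in place of an energy model, so it does not reproduce Mfold output. The function name `nussinov`, the `min_loop` parameter and the example sequence are illustrative assumptions, not part of the study.

```python
# Nussinov base-pair maximization: a simplified stand-in for MFE folding.
# It maximizes the number of base pairs; it does NOT compute free energies.

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for a nucleotide sequence.

    min_loop is the minimum number of unpaired bases enclosed by a pair
    (hypothetical default of 3).
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    n = len(seq)
    if n == 0:
        return 0, ""
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Traceback to recover one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], "".join(structure)


if __name__ == "__main__":
    pairs, fold = nussinov("GGGAAAUCCC")  # toy sequence, not from the study
    print(pairs, fold)
```

Because several structures can share the same pair count, the traceback returns only one of the optimal structures; an energy-based tool such as Mfold ranks alternatives by free energy instead.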
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observed in the field of computational 
drug design [10, 11] and bioinformatics,
and more opportunities are now available to
understand biological mechanisms [10].
Numerous biological problems have been
resolved using various bioinformatics
approaches [12, 13]. Moreover, structural
and functional bioinformatics offer
efficient techniques to screen and design
novel compounds against various disorders,
including COVID-19 [14-16].


Materials and Methods 


ncRNA Prediction 


ncRNAs were predicted in the genomes of
the E. cloacae complex using the RNAspace
web server [17], which identifies ncRNAs
in a genome. Each selected genome was
submitted to RNAspace with its species
and strain name, using default
parameters together with the homology
search parameters. For the comparative
analyses, Enterobacter sakazakii was
selected as the reference organism
because this selection gave suitable
results. The FASTA sequence was analyzed
with BLAST for sequence alignment,
CG-seq for sequence aggregation and RNAz
for structure inference. The generated
results were exported in Excel sheet
format.


Screening of ncRNAs 


The ncRNAs were screened manually.
Sequences shorter than 75 nucleotides
were deleted because the structures of
shorter sequences were not considered
stable. Ribosomal RNAs and transfer
RNAs were removed to avoid pseudogenes.
Repeats were removed on the basis of the
names assigned to the ncRNAs by
RNAspace; this was an important step
because the purpose of the current
study was to predict structures.
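
The screening in this study was done manually. As a minimal sketch of the same three rules (length cut-off of 75 nucleotides, removal of rRNAs and tRNAs, removal of repeats by name), the code below applies them to a list of predictions. The `Prediction` record, the keyword matching on names and the example entries are illustrative assumptions; only the 75-nucleotide threshold comes from the text.

```python
from dataclasses import dataclass

MIN_LENGTH = 75  # sequences shorter than this are discarded (from the text)
# Assumption: rRNA/tRNA predictions can be recognized by these substrings.
EXCLUDED_KEYWORDS = ("trna", "rrna")


@dataclass(frozen=True)
class Prediction:
    """Hypothetical record for one predicted ncRNA."""
    name: str
    sequence: str


def screen(predictions):
    """Apply the three screening rules and keep the first copy of each name."""
    kept = []
    seen_names = set()
    for pred in predictions:
        if len(pred.sequence) < MIN_LENGTH:
            continue  # rule 1: too short to be considered stable
        lowered = pred.name.lower()
        if any(keyword in lowered for keyword in EXCLUDED_KEYWORDS):
            continue  # rule 2: ribosomal or transfer RNA
        if lowered in seen_names:
            continue  # rule 3: repeat of an already retained name
        seen_names.add(lowered)
        kept.append(pred)
    return kept


if __name__ == "__main__":
    # Placeholder sequences; lengths are chosen only to exercise each rule.
    candidates = [
        Prediction("MicF", "A" * 86),
        Prediction("MicF", "C" * 90),      # repeat by name
        Prediction("tRNA-Ala", "G" * 76),  # transfer RNA
        Prediction("short", "A" * 74),     # below the length cut-off
        Prediction("sroD", "A" * 84),
    ]
    for pred in screen(candidates):
        print(pred.name, len(pred.sequence))
```

Deduplicating by name would collapse distinct predictions that share a generic label such as "Unknown", so unnamed predictions would need a unique identifier before this rule is applied.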
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RNAs were also observed. Interestingly, 
numerous known and unknown RNAs
were also observed and analyzed. The known
RNAs IS1222-FSE, sroE, t44, MicF, ryfA and
LR-PK1 were found in the generated set of
predicted ncRNAs. In total, 3441 ncRNAs
were predicted in Burkholderia
cenocepacia strain J2315 (Fig. 1). The
predicted ncRNAs varied from species to
species.

Of the 304, 197, 239 and 208 predicted
ncRNAs, ribosomal RNAs, transfer RNAs,
repeats and sequences shorter than 75
nucleotides were eliminated, and 275,
131, 146 and 142 ncRNAs were retained
for further analyses. Of the 3441 putative
ncRNAs predicted in Burkholderia
cenocepacia strain J2315, 213 remained
after the selected parameters were
applied. The number of tRNAs, rRNAs and
short sequences varies from genome to
genome, so the putative ncRNAs of each
selected genome may include a different
number of tRNAs and rRNAs. The known
and unknown scrutinized sequences (275,
131, 146 and 142 ncRNA sequences)
were evaluated and analyzed. The names
of all the scrutinized ncRNAs were
verified, and the ncRNAs were
characterized on the basis of hairpin
loops (Table 1). False-positive
rRNAs and tRNAs were eliminated.
Extensive in silico analyses evidenced


GGCGCTTCAGGTGGCTCTGGGGCGAAAGTACTGACGACAGACCAGAAGCGGGAAGCTGTGGTGTTGATGTGTGATGCGACCGGTCTGTCGCAACGTCGTGCCTGCAGGCTTACAGGTT
AACCCCCTGAAATCTGCAATCAACTTGGCGGAAGGTTCAGATATTCAGGGGGTCATAAAGCAGGCGCGCTGCCTGTGAAGGATTATAACGCATGCACCATATAAAACAACACCCGGCCGCCATA
GTAGATGCTCATTCCACCTCTTATGTTCGCCTCAGGGCCTCATAAACTCAGGAATGACGCAGAGCCATTTAATGGTGCTTATCGTCCACAGACAGATGTCGCTTCGGCCTCATCAAACACCATGGACACAACGTT
ACCGACGATGGCGACTACCAGGTAAAACTCCGCAGCCTGGTTCGCTTTCTGGAAGAGGGTGATAAAGCTAAGATCACACTGCGTTTCCGCGGTCGTGAGATGGCTCACCAGCAGATTGGTATGGAAGTGCTT



Functional characterization of ncRNAs

ncRNAs with stable structures were
considered suitable and likely to be
functional. The predicted structures
were therefore checked for stability by
examining the MFE value, bulges, hairpin
loops, pseudoknots, stem junctions,
cross pairing and overlapping. ncRNAs do
not perform a direct function but can
regulate their associated genes. To
assess ncRNA function, the associated
genes were therefore predicted. BLAST
[22] was used to screen for associated
genes: the stably structured ncRNA
sequences were submitted to the
nucleotide BLAST option of NCBI BLAST.
Total score and identity were considered
in the final selection of ncRNAs. The
predicted associated genes were verified
through GenBank, and the function of each
known gene was verified using UniProtKB [23].
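
One of the structural features checked above, the number of hairpin loops, can be counted automatically once a structure is available as text. The sketch below assumes the predicted structure has been exported in dot-bracket notation, which is an assumption made here for illustration and not something the study states; the function names are likewise hypothetical. Pseudoknots cannot be represented in plain dot-bracket notation and are therefore not handled.

```python
import re

# A hairpin loop in dot-bracket notation is an opening bracket followed
# only by unpaired positions (dots) and then a closing bracket.
_HAIRPIN = re.compile(r"\(\.+\)")


def is_balanced(structure):
    """Check that the string is valid, pseudoknot-free dot-bracket notation."""
    depth = 0
    for symbol in structure:
        if symbol == "(":
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                return False  # closing bracket without a partner
        elif symbol != ".":
            return False  # unexpected character
    return depth == 0


def count_hairpins(structure):
    """Return the number of hairpin loops in a dot-bracket structure."""
    if not is_balanced(structure):
        raise ValueError("not a balanced dot-bracket string")
    return len(_HAIRPIN.findall(structure))


if __name__ == "__main__":
    # Toy structure with two hairpins joined by a multibranch stem.
    print(count_hairpins("((..((...))..((....))..))"))
```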


Results and Discussion 


Numerous bioinformatics analyses were
performed to screen novel ncRNAs. In
total, 304, 197, 239 and 208 ncRNAs were
predicted in the intergenic regions of
the E. kobei, E. asburiae, E.
cancerogenus and E. hormaechei genomes,
respectively. These results were carried
forward for further analysis. Among the
predicted ncRNAs, ribosomal RNAs and transfer


>000001 | IS1222_FSE | bacteria | kobei | unknown | unknown | + | 51836 | 51953
>000003 | PK-repBA | bacteria | kobei | unknown | unknown | + | 68144 | 68267
>000010 | GlmZ_SraJ | bacteria | kobei | unknown | unknown | + | 4322361 | 4322551
>000013 | LR-PK1 | bacteria | kobei | unknown | unknown | + | 1974747 | 1975084
>000014 | ryfA | bacteria | kobei | unknown | unknown | + | 1399697 | 1399874
GCGTCCCTTTCCGCCATCTCGCAAATGGGCACCGATCCAGGGAAAGGATTATCCACAACCGTAATCAGGCACTATTCCGTGCTTGCATCCGCCGAATGATCATCGGTGGTGAGACGGTGGAGCGGTTTTCAGC
>000018 | sroD | bacteria | kobei | unknown | unknown | - | 2522839 | 3
TTGCGTGACGAAGCCCGCGCCAAAGTAGACAATAAAGTCTGAGCTTTGAGTAAGTCGCCTGACGCCGGTTAGCCGGCG A
>000019 | sroE | bacteria | kobei | unknown | unknown | - | 3293880
ATAACGTGACTGGGAAGCGGCTTGCTTCCCGTGTATGATTGAACCCGCAGCGCGCCCGGCAGGTCAGGGTGAGCGCTAAGGGTTCA AT
>000030 | SraG | bacteria | kobei | unknown | unknown | + | 3999129 | 399929
CTTCTGTGCATCCTCGCGACTAATGACAACCCTAACCCAAACCGGGTAAAGCCTCTCATTAGCCGCGCGAACCTCTGCAACGAAGATCATTCATAGCAACAATACAATAGTTTAGGGTGAATTGCTGCCGTCTG
>000033 | Mg_sensor | bacteria | kobei | unknown | unknown | + | 543947 | 544058
TTACCGGAGGCAACATGGATCCTGATCCCACCCCTCTCCCCGACGGGAGTTTTCCCGCGTCCCGGTAAGCCAGTTCTCGCTGCCTTGCCAGACGCGTAAGGCAGCGACGTTT
>000034 | MicF | bacteria | kobei | unknown | unknown | + | 3070405 | 307049
CGCTATCATCATTAAC ATTTATTACCTTCATTCGGCTTCGAATGACTGTTTACCCCTATTACAACCGGATGCCCTGCATTCGG

Fig. 1: Top ranked 20 predicted ncRNAs
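
The header lines shown in Fig. 1 are pipe-delimited and end with two genome coordinates. The sketch below parses such a header and takes the difference of the coordinates; for the entries whose coordinates are complete in the figure, this difference matches the length listed for the same ncRNA in Table 1. The function name and the returned field names are illustrative assumptions, and the meaning of the intermediate fields is inferred from the figure rather than documented in the text.

```python
def parse_header(header):
    """Split one pipe-delimited header line into named fields.

    Expected layout (inferred from Fig. 1):
    >id | name | domain | species | unknown | unknown | strand | start | end
    """
    fields = [field.strip() for field in header.strip().lstrip(">").split("|")]
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {header!r}")
    start, end = int(fields[7]), int(fields[8])
    return {
        "id": fields[0],
        "name": fields[1],
        "strand": fields[6],
        "start": start,
        "end": end,
        # Difference of the two coordinates, as tabulated in Table 1.
        "length": end - start,
    }


# Headers with complete coordinates, taken from Fig. 1.
HEADERS = [
    ">000001 | IS1222_FSE | bacteria | kobei | unknown | unknown | + | 51836 | 51953",
    ">000003 | PK-repBA | bacteria | kobei | unknown | unknown | + | 68144 | 68267",
    ">000010 | GlmZ_SraJ | bacteria | kobei | unknown | unknown | + | 4322361 | 4322551",
    ">000013 | LR-PK1 | bacteria | kobei | unknown | unknown | + | 1974747 | 1975084",
]

if __name__ == "__main__":
    for line in HEADERS:
        record = parse_header(line)
        print(record["id"], record["name"], record["length"])
```

Headers whose coordinates are cut off in the figure have fewer than nine fields or an incomplete end coordinate, so they are rejected or should be excluded rather than silently producing a wrong length.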
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overlapping, pseudoknots and stem 
junctions (Fig. 2). Stability was judged
by the MFE value: the lower the MFE, the
more stable the structure. Secondary
structures were also predicted for the
4 screened ncRNAs of B. cenocepacia
strain J2315. The number of screened
ncRNAs eliminated for lacking a stable
structure differed from genome to
genome.

It was observed that Unknown230,
IS1222-FSE and isrK have 4 hairpin
loops, a single-nucleotide bulge, a
three-nucleotide bulge, a cross pair and
overlapping, with a least MFE value of
-28.60.

Two stable structures were observed
among the 259 ncRNA structures of E. kobei.
The stability of the predicted ncRNAs was
verified on the basis of the selected
parameters, which the stable structures
satisfied. The stable structures showed
hairpin loops, bulges, pseudoknots, stem
junctions, overlapping and cross pairs,
with least MFE values of -62.60 and
-73.10 (Fig. 3A and 3B). Only one stable
structure was observed among the 131
scrutinized ncRNAs of E. asburiae
(Fig. 3C). It showed free hairpin
loops, bulges, pseudoknots, stem
junctions, overlapping and cross pairs,
with a least MFE value of -20.90. Two
stable structures were observed among
the 146 ncRNAs of E. cancerogenus. They
showed free hairpin loops, bulges,
pseudoknots, stem junctions,
overlapping and cross pairs, with least
MFE values of -27.20 and -67.20 (Fig. 3D
and 3F). Only one stable structure, with
the lowest MFE value of -58.30, was
observed among the 142 ncRNAs of E.
hormaechei (Fig. 3E). All the selected
stable structures were analyzed for
potential functions, since ncRNAs
function indirectly through the genes
they regulate.
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Table 1: The names of all the scrutinized 


ncRNAs 

Sr #  Name        Rfam        Length (nt)
01    IS1222-FSE  IS1222-FSE  117
003   PK-repBA    PK-repBA    123
010   GlmZ-SraJ   GlmY-tke1   190
013   LR-PK1      LR-PK1      337
014   ryfA        ryfA        177
018   sroD        sroD        84
019   sroE        sroE        92
030   SraG        SraG        164
033   Mg sensor   Mg sensor   111
034   MicF        MicF        86
035   Unknown     Unknown     136
040   Unknown     isrK        76
043   DsrA        DsrA        78
044   Unknown     RybB        79
045   SraC_RyeA   RyeB        133
051   Unknown     Unknown     99
053   Unknown     OmrA-B      76


that 259, 131, 146 and 142 sequences 
have the potential to act as ncRNAs. 


Promoters act as transcription
initiation sites, where the
transcription of DNA into RNA begins.
No promoter was found attached to any of
the selected refined 259, 131, 146 and
142 ncRNA sequences. In contrast,
promoter prediction for ncRNAs in E.
coli identified promoters attached to
ncRNAs, and extensive comparative
analyses showed that promoters were
attached to ncRNAs.

It was observed that almost all the
screened ncRNAs (259 of E. kobei, 131 of
E. asburiae, 146 of E. cancerogenus and
142 of E. hormaechei) have unstable
structures. The stability of the
screened structures depends on numerous
factors, including minimum free energy
(MFE), number of bulges, number of
hairpin loops, cross pairing,
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Fig. 3: Stable structures of ncRNAs: (A) E. Kobie Unknown 472 (B) E. Kobie Unknown 290 
(C) E. asburiae Unknown110 (D) E. cancerogenus Unknown175 (E) E. hormaechei Unknown097


(F) E. cancerogenus Unknown065.

Detailed analyses were performed to
scrutinize the genes associated with the
ncRNAs. In Enterobacter kobei, the stable
ncRNA structure Unknown290 was associated
with ycjD. The function of ycjD was
predicted through in silico analyses:
ycjD encodes a DNA-cytosine
methyltransferase associated with
endonuclease activity, whose key function
is to cleave nucleic acid strands by
hydrolyzing the ester linkages within
them (Table 2). Endonuclease activity has
research applications in marker and primer
design. Moreover, it can act against viral
diseases by cleaving viral DNA/RNA.
Another stable ncRNA structure
(Unknown472) was also analyzed and its
secondary structure was predicted. The
associated genes of ncRNAs of B.
thuringiensis were cross-verified for
further analyses.

Enterobacter asburiae showed one stable
structure (Unknown110) (Table 3).
Enterobacter cancerogenus showed two
stable structures (Table 4). Enterobacter
hormaechei showed only one stable
structure (Unknown097) and its
associated gene (Table 5).

Table 2: Functional prediction of stable structures of E. kobei

Sr. #  Name        ncRNA Sequence                  Gene     Protein                         Function
1      Unknown290  TITICCCCTCACCCCGACCCTCTCCCCATG  ycjD     DNA-cytosine methyltransferase  Endonuclease activity
                   GGAGAGGGAGAACACCGGACCATTCCCTCT
                   CCCTACGGGAGAGGGCCAGGGTGAGGGG
2      Unknown472  ATTGCCCCTCACCCCGGCCCTCTCCCCTCG  Unknown  Unknown                         Unknown
                   GGAGAGGGGGAAATACGGGTACAAACGATC
                   CCCTCTCCCCTCGGGAGAGGGTTAGGGTGA
                   GGGGTT


Table 3: Functional prediction of the stable structure of E. asburiae

Sr. #  Name        ncRNA Sequence                         Gene     Protein  Function
1      Unknown110  GATGAAAATTTTTACCATATAAATTACACACAGTGAA  Unknown  Unknown  Unknown
                   AATTATCATCAAAAACCAGGAAGCCGATCATACTTTT
                   TCAAAATGACTGGCATCTTTCCCCTCCTTTCCGCCAC
                   ACT


Table 4: Functional prediction of stable structures of E. cancerogenus

Sr. #  Name        ncRNA Sequence               Gene  Protein                                    Function
1      Unknown065  CCCGTCCCTCTAAAGGGTTATAGCGT   0     Arsenate reductase
                   CGTTTATAAGATGCATTTAATATGCAT
                   CTTATATTATTGATGATGAGGTAACTG
                   CT
2      Unknown175  GCCCGGTGGCGCTACGCTTACCGGG    dnaA  Chromosomal replication initiator protein
                   CCTACGGGAAACCACAAATTCTGTAG
                   GCCGGGTAAGCGTAGCGCCACCCGG
                   CATT

Table 5: Functional prediction of the stable structure of E. hormaechei

Sr. #  Name        ncRNA Sequence                Gene     Protein  Function
1      Unknown097  TGCCCGGTGGCGCTGCGCTTACCGGGC   unknown  unknown  unknown
                   CTACATATGCAGTATTTGTAGGCCGGGT
                   AAGCGAAGCGCCACCCGGCGTTGTT

Enterobacter cloacae complex (ECC)
includes common nosocomial pathogens
capable of producing a wide variety of
infections. Broad-spectrum antibiotic
resistance, including the recent
emergence of resistance to last-resort
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carbapenems, has led to increased 
interest in this group of organisms and 
carbapenem-resistant E. cloacae complex 
(CREC) in particular. 


Conclusion 


In conclusion, novel ncRNAs were
screened and their secondary structures
were predicted. The function of the
scrutinized ncRNAs was also predicted.
The novel ncRNAs were characterized on
the basis of number of nucleotides,
hairpin loops and least MFE values. The
genomes of the Enterobacter cloacae
complex showed stable ncRNA sequences.
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